Analysis of the algorithm: From kernels to backup genes.

Kernelization section

The algorithm transformed each semantic similarity matrix to make it compatible with a kernel. Once this was done for every network and kernel type, the resulting kernels were integrated by kernel type. The tables below summarize the properties of the matrices at each phase of the process.

Annotation properties

Table 1. Annotation file descriptors

Net Min Max Average Standard_Deviation
biological_process 1 134 6.994822615755721 11.426508641027697
cellular_component 1 40 4.162222345933308 5.25157343549579
disease 1 21 2.2250479846449136 2.909050012799259
gene_PS 1.0 7.0 1.262828947368421 0.6094107569948181
gene_TF 1.0 61.0 1.556631455399061 1.065224706989774
gene_hgncGroup 1.0 8.0 1.1981323266441486 0.5234843380177688
genetic_interaction_effect_bicor 1081.0 1086.0 1085.8046559870922 0.9687934667201032
molecular_function 1 26 3.026420536486876 3.712318613679619
pathway 1.0 191.0 4.003825833485152 7.7291191049412005
phenotype 1 335 31.54500689383494 46.98185670201637
protein_interaction 1.0 7354.0 612.2996319549686 503.120287578834
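
The descriptors in Table 1 can be reproduced with a few lines of Python. The following is a minimal sketch, assuming each row summarizes one numeric value per entry of the annotation file (for example, the number of annotations per gene), which the source does not state. The function name `annotation_descriptors` and the sample values are hypothetical, and the choice of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def annotation_descriptors(values):
    """Return the four descriptors used as columns in Table 1.

    `values` is assumed to hold one number per entry of an annotation file.
    The population standard deviation is used here; the source does not say
    whether the reported figure is the population or the sample estimate.
    """
    return {
        "Min": min(values),
        "Max": max(values),
        "Average": mean(values),
        "Standard_Deviation": pstdev(values),
    }


# Hypothetical per-gene values, for illustration only.
example = annotation_descriptors([1, 2, 3, 6])
```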

Matrix properties

Table 2. Similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process_sim 16994x16994 288796036 256327240
cellular_component_sim 17963x17963 322669369 320852982
disease_sim 4162x4162 17322244 17243154
gene_PS_sim 3020x3020 9120400 95698
gene_TF_sim 1044x1044 1089936 172858
gene_hgncGroup_sim 25136x25136 631818496 14457654
genetic_interaction_effect_bicor_sim 17354x17354 301161316 202089222
molecular_function_sim 17333x17333 300432889 296559336
pathway_sim 3429x3429 11758041 159182
phenotype_sim 5077x5077 25775929 25715036
protein_interaction_sim 18476x18476 341362576 11271602
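
The three descriptor columns of Table 2 follow directly from the matrix itself: the dimensions, the total number of elements (rows times columns) and the number of entries different from zero. The sketch below shows one way to compute them for a matrix stored as a list of rows. The function name `matrix_descriptors` and the toy matrix are hypothetical and are not taken from the original pipeline.

```python
def matrix_descriptors(matrix):
    """Compute dimensions, element count and non-zero count of a matrix.

    `matrix` is a list of equally long rows (a dense representation).
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    non_zero = sum(1 for row in matrix for value in row if value != 0)
    return {
        "Matrix_Dimensions": f"{rows}x{cols}",
        "Matrix_Elements": rows * cols,
        "Matrix_Elements_Non_Zero": non_zero,
    }


# Toy 3x3 similarity matrix, for illustration only.
toy = [
    [0.0, 0.4, 0.0],
    [0.4, 0.0, 0.9],
    [0.0, 0.9, 0.0],
]
toy_descriptors = matrix_descriptors(toy)

# Fraction of non-zero entries, a convenient way to compare networks.
toy_density = (
    toy_descriptors["Matrix_Elements_Non_Zero"] / toy_descriptors["Matrix_Elements"]
)
```

Dividing the non-zero count by the element count gives the density of a matrix, which makes the contrast between dense networks such as cellular_component_sim and sparse ones such as gene_PS_sim easier to read than the raw counts.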

Table 3. Filtered similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process_sim 16994x16994 288796036 256327240
cellular_component_sim 17963x17963 322669369 320852982
disease_sim 4162x4162 17322244 17243154
gene_PS_sim 3020x3020 9120400 95698
gene_TF_sim 1044x1044 1089936 172858
gene_hgncGroup_sim 25136x25136 631818496 14457654
genetic_interaction_effect_bicor_sim 17354x17354 301161316 202089222
molecular_function_sim 17333x17333 300432889 296559336
pathway_sim 3429x3429 11758041 159182
phenotype_sim 5077x5077 25775929 25715036
protein_interaction_sim 18476x18476 341362576 11271602

Table 4. Uncombined kernel matrices

Net Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process ct 16994x16994 288796036 288796036
biological_process el 16994x16994 288796036 288796036
biological_process ka 16994x16994 288796036 256344234
biological_process rf 16994x16994 288796036 288796036
cellular_component ct 17963x17963 322669369 322669369
cellular_component el 17963x17963 322669369 322669369
cellular_component ka 17963x17963 322669369 320870945
cellular_component rf 17963x17963 322669369 322669369
disease ct 4162x4162 17322244 17322244
disease el 4162x4162 17322244 17322244
disease ka 4162x4162 17322244 17247316
disease rf 4162x4162 17322244 17322244
gene_PS ct 3020x3020 9120400 8927548
gene_PS el 3020x3020 9120400 5842664
gene_PS ka 3020x3020 9120400 98718
gene_PS node2vec 3020x3020 9120400 9120400
gene_PS rf 3020x3020 9120400 5842664
gene_TF ct 1044x1044 1089936 1089910
gene_TF el 1044x1044 1089936 1062998
gene_TF ka 1044x1044 1089936 173902
gene_TF node2vec 1044x1044 1089936 1089936
gene_TF rf 1044x1044 1089936 1062998
gene_hgncGroup ct 25136x25136 631818496 631304330
gene_hgncGroup el 25136x25136 631818496 326932198
gene_hgncGroup ka 25136x25136 631818496 14482790
gene_hgncGroup rf 25136x25136 631818496 326932198
genetic_interaction_effect_bicor ct 17354x17354 301161316 301161316
genetic_interaction_effect_bicor el 17354x17354 301161316 301161316
genetic_interaction_effect_bicor ka 17354x17354 301161316 202106576
genetic_interaction_effect_bicor rf 17354x17354 301161316 301161316
molecular_function ct 17333x17333 300432889 300432889
molecular_function el 17333x17333 300432889 300432889
molecular_function ka 17333x17333 300432889 296576669
molecular_function rf 17333x17333 300432889 300432889
pathway ct 3429x3429 11758041 11744123
pathway el 3429x3429 11758041 8641125
pathway ka 3429x3429 11758041 162611
pathway node2vec 3429x3429 11758041 11758041
pathway rf 3429x3429 11758041 8641125
phenotype ct 5077x5077 25775929 25775929
phenotype el 5077x5077 25775929 25775929
phenotype ka 5077x5077 25775929 25720113
phenotype rf 5077x5077 25775929 25775929
protein_interaction ct 18476x18476 341362576 341362576
protein_interaction el 18476x18476 341362576 341362576
protein_interaction ka 18476x18476 341362576 11233250
protein_interaction rf 18476x18476 341362576 341362576

Table 5. Integrated kernel matrices

Integration Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
integration_mean_by_presence ct 30164x30164 909866896 801364996
integration_mean_by_presence el 30164x30164 909866896 577538338
integration_mean_by_presence ka 30164x30164 909866896 373324882
integration_mean_by_presence node2vec 18669x18669 348531561 342608901
integration_mean_by_presence rf 30164x30164 909866896 577538338
mean ct 30164x30164 909866896 801364996
mean el 30164x30164 909866896 577538338
mean ka 30164x30164 909866896 373324882
mean node2vec 18669x18669 348531561 342608901
mean rf 30164x30164 909866896 577538338
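
The source does not define the two integration modes, so the following sketch rests on an assumption about their meaning: `mean` divides the summed kernel values of a gene pair by the total number of networks, while `integration_mean_by_presence` divides by the number of networks in which both genes are present. Both work on the union of the genes of all networks, which would explain why the integrated matrices are larger than any single network. The function `integrate` and the toy kernels are hypothetical.

```python
from itertools import product


def integrate(kernels, by_presence=False):
    """Average gene-by-gene kernels over the union of their genes.

    kernels: list of (genes, matrix) pairs, where matrix[i][j] is the kernel
    value between genes[i] and genes[j].
    by_presence: if True, average only over the kernels that contain both
    genes; if False, average over all kernels, counting absences as zero.
    """
    union = sorted({gene for genes, _ in kernels for gene in genes})
    index = {gene: position for position, gene in enumerate(union)}
    size = len(union)
    total = [[0.0] * size for _ in range(size)]
    count = [[0] * size for _ in range(size)]

    # Accumulate the sum and the number of contributing kernels per gene pair.
    for genes, matrix in kernels:
        for (i, gene_i), (j, gene_j) in product(enumerate(genes), repeat=2):
            total[index[gene_i]][index[gene_j]] += matrix[i][j]
            count[index[gene_i]][index[gene_j]] += 1

    result = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            denominator = count[row][col] if by_presence else len(kernels)
            if denominator:
                result[row][col] = total[row][col] / denominator
    return union, result


# Two toy kernels that share only gene "B", for illustration only.
kernel_1 = (["A", "B"], [[1.0, 0.5], [0.5, 1.0]])
kernel_2 = (["B", "C"], [[1.0, 0.2], [0.2, 1.0]])

genes, plain_mean = integrate([kernel_1, kernel_2])
_, presence_mean = integrate([kernel_1, kernel_2], by_presence=True)
```

Under this assumed reading the two modes differ only in the denominator, so they produce the same pattern of zero and non-zero entries. That is consistent with Table 5, where both integrations report identical non-zero counts for every kernel type.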

Weight values